Abstract

This paper focuses on the coauthor effect in different types of publications,
which are usually not equally respected in measuring research impact. A priori
unexpected relationships are found between the total coauthor core value,
$m_a^{(jp)}$, of a leading investigator (LI), and the related values for their
publications in either peer review journals (j) or in proceedings (p). A
surprisingly linear relationship is found: $m_a^{(j)} + 0.4\, m_a^{(p)} =
m_a^{(jp)}$. Furthermore, another relationship is found concerning the measure
of the total number of citations, $A_a$, i.e. the surface of the citation
size-rank histogram up to $m_a$. Another linear relationship exists:
$A_a^{(j)} + 1.36\, A_a^{(p)} = A_a^{(jp)}$.

These empirical coefficients (0.4 and 1.36) are supported by considerations
based on an empirical power law found between the number of joint publications
of an author and the rank of a coauthor. Moreover, a simple power law
relationship is found between $m_a$ and the number ($r_M$) of coauthors of a LI:
$m_a \simeq r_M^{\mu}$; the power law exponent $\mu$ depends on the type (j or p)
of publications. These simple relations, at this time limited to publications
in physics, imply that coauthors are a "more positive measure" of a principal
investigator role, in both types of scientific outputs, than the Hirsch index
could indicate. Therefore, to scorn co-authors in publications, in particular
in proceedings, is incorrect. On the contrary, the findings suggest an
immediate test of coherence of scientific authorship in scientific policy
processes.


1 Introduction 

In recent years, studies of complex systems have become widespread in the
scientific community, especially in statistical physics. Many examples,
e.g. [...], pertain to social phenomena in general, indicating that physicists
have gone far from their traditional domain of investigation [...] [7].
Moreover, one very modern topic of investigation is the measurement (as
accurately and objectively as is done in physics) of the value of some
scientific production [...].

In [10], it was shown that a Zipf-like law

$$ J \propto 1/r, \tag{1} $$

exists between the number ($J$) of joint publications (NJP) of a scientist,
called for short "leading investigator" (LI), with her/his coauthor(s) (CAs);
$r = 1, \dots$ is an integer allowing some hierarchical ranking of the CAs,
$r = 1$ being the most prolific coauthor with the LI. The number of different
coauthors (NDCA) is given by the highest possible rank $r_M$. Several CAs often
have the same NJP with the LI.

It was observed that a hyperbolic (scaling) law is more appropriate, i.e.,

$$ J = J_0 / r^{\alpha}, \tag{2} $$

with $\alpha \neq 1$, usually such that $\alpha < 1$; $\alpha$ often decreases
with the number of CAs or with the number of joint publications, e.g. when the
number of CAs and $J$ are "not large". $J_0$ is a fit parameter, i.e. there is
no meaning to $r = 0$.

As the $h$-index [...] "defines" the core of papers of an author from
the relationship between the number of citations $n_c$ and the corresponding
rank $r$ of a paper, through a trivial threshold, i.e. if $n_c \ge r_c$, then
$r_c \equiv h$, one is also allowed to define the core of coauthors of a
scientist through a threshold $m_a$, called the $m_a$-index:

$$ m_a \equiv r, \quad \text{as long as } r \le J. \tag{3} $$
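
As a minimal sketch of the threshold in Eq. (3), assuming the inclusive convention $r \le J$ by analogy with the $h$-index, the core size can be computed from a list of joint publication counts. The function name `core_size` and the sample counts are hypothetical and serve only as an illustration; they are not data from this study.

```python
def core_size(joint_counts):
    """Return m_a: the largest rank r such that the r-th most frequent
    coauthor has at least r joint publications with the LI (assumes r <= J)."""
    ranked = sorted(joint_counts, reverse=True)  # rank 1 = most prolific CA
    m_a = 0
    for rank, j in enumerate(ranked, start=1):
        if j >= rank:
            m_a = rank
        else:
            break
    return m_a


# Hypothetical NJP values of seven coauthors (not data from this paper).
sample = [10, 7, 5, 3, 3, 1, 1]
print(core_size(sample))  # prints 3
```

With these counts the first three coauthors satisfy the threshold, while the fourth (3 joint papers at rank 4) does not.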

This is a specific measure of the core of the most relevant CAs in a research
team centered on the LI. In brief, in the $h$-index method, one implicitly
assumes that the number of "important papers" of an author, those which are the
most often quoted, allows one to measure the impact of a researcher [...].
There is no need to discuss at length the $h$-index power, variants, or defects.
However, such a citation effect is often due to the activity of a research
team centered on the LI [...]. In fact, the size and structure of a temporary
or long lasting group is surely relevant to the productivity of an author [16].
In contrast, the $m_a$-index as introduced measures the role of coauthors,
rather than citations, to indicate the most important coworkers of a LI,
allowing one to measure the LI team core. Technically, one could thus measure
the relevant strength of a research group centered on some leader and measure
some impact of research collaboration, e.g., on scientific productivity [...].
The invisible college [...] of a LI would become visible and easily quantified,
thereby pointing to some selection in the community.

Several other measures can be defined, as in the $h$-index method, e.g. by
taking into account the whole surface of the histogram, i.e. the cumulated
number of joint publications (NJP),

$$ \Sigma = \sum_{r=1}^{r_M} J_r, \tag{4} $$

where the CA with rank $r$ has published $J_r$ publications with the LI. An
often discussed part of the histogram is that up to the threshold; it
corresponds to the cumulated NJP limited to the core, i.e.

$$ A_a = \sum_{r=1}^{m_a} J_r. \tag{5} $$

The notation is reminiscent of the $A$-index [...] in the Hirsch scientific
output measurement method of an author. Of course, $A_a/\Sigma$ gives the
relative weight of the core CAs in the cumulated NJP.

Moreover, one can define an $a_a$-index [10], which measures the surface below
the empirical data of the number of joint publications up to the CA of rank
$m_a$, normalized to $m_a$, i.e.

$$ a_a = \frac{A_a}{m_a} = \frac{1}{m_a} \sum_{r=1}^{m_a} J_r, \tag{6} $$

and similarly the $a_M$ index

$$ a_M = \frac{\Sigma}{m_a} = \frac{1}{m_a} \sum_{r=1}^{r_M} J_r, \tag{7} $$

measured from the whole histogram surface. Obviously, $A_a/\Sigma = a_a/a_M$.
The notations are similar to those of the $h$-index scheme, where they somewhat
measure the average number of citations of papers in the Hirsch core [13].

Note that the true mean $\mu$ of the $J$ vs. $r$ distributions, i.e. the
average NJP per CA, is obtained from

$$ \mu = \frac{\Sigma}{r_M} \equiv \frac{\Sigma}{(NDCA)}. \tag{8} $$
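
The definitions of Eqs. (3)-(8) can be gathered in a short sketch. It assumes the inclusive threshold $r \le J_r$ for the core; the function name `coauthor_measures` and the sample counts are hypothetical and are not taken from the data of this study.

```python
def coauthor_measures(joint_counts):
    """Compute the coauthorship measures of Eqs. (3)-(8) from raw NJP values."""
    ranked = sorted(joint_counts, reverse=True)   # J_r, rank 1 first
    if not ranked or ranked[0] < 1:
        raise ValueError("need at least one coauthor with one joint paper")
    r_max = len(ranked)                           # r_M, i.e. NDCA
    # Core size m_a: the condition J_r >= r holds on a prefix of the ranks,
    # so counting the ranks that satisfy it gives the threshold rank.
    m_a = sum(1 for rank, j in enumerate(ranked, start=1) if j >= rank)
    sigma = sum(ranked)                           # Eq. (4), whole surface
    a_core = sum(ranked[:m_a])                    # Eq. (5), surface up to m_a
    return {
        "m_a": m_a,
        "r_M": r_max,
        "Sigma": sigma,
        "A_a": a_core,
        "a_a": a_core / m_a,                      # Eq. (6)
        "a_M": sigma / m_a,                       # Eq. (7)
        "mu": sigma / r_max,                      # Eq. (8)
    }


# Hypothetical NJP values of seven coauthors (not data from this paper).
measures = coauthor_measures([10, 7, 5, 3, 3, 1, 1])
print(measures)
```

For this hypothetical list the identity $A_a/\Sigma = a_a/a_M$ can be checked directly, since both ratios reduce to 22/30.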

In practical terms, these indirect measures are attempts to improve the
sensitivity of the threshold-forced index, in order to take into account the
number of co-authors whatever the number of joint publications among the most
frequent coauthors, and to introduce a contrast between the most frequent CAs
and the less frequent ones. Indeed, JPs often have a mix of different CAs^1
[...]. It has also been observed in previous fits, through Eq. (2), that
unusual (long or short) lists of coauthors, as well as the hapax-like CAs, i.e.
accidental or rare CAs, but with necessarily large $r$ values, have much
influence on $J_0$ and $\alpha$, and on the resulting $R^2$.

Moreover, it is somewhat commonly accepted that proceedings papers, e.g.
resulting from conference presentations, have to be distinguished from peer
review journal publications. Miskiewicz [29] has discussed whether such
different types of publications have some impact on the core number and on the
ranking of CAs. For completeness, note that a complementary question was also
examined, i.e. whether a "binary scientific star"-like system implies some
deviation from Eq. (2), the "binary scientific star" (BSS) being defined as the
couple formed by a LI and one of his/her most frequent CAs [30].

In the following sections, an amazingly simple relationship is reported
between $m_a^{(jp)}$ and its related values for publications in peer review
journals (j) and in proceedings (p), i.e. $m_a^{(j)} + 0.4\, m_a^{(p)} =
m_a^{(jp)}$. Moreover, another relationship is found concerning the $A_a$
index, i.e. the surface of the $J$ vs. $r$ histogram up to $m_a$, i.e.,
$A_a^{(j)} + 1.36\, A_a^{(p)} = A_a^{(jp)}$. A discussion of other empirical
(linear) relations is presented. The illustrative data on the coauthorship
features are briefly recalled, for the few published cases, in Sect. 2. In
Sect. 3, some hint is presented on some origin of the surprising (or
unexpected) relationship, and of the coefficient values. The case of anomalous
data points is also discussed. Some justification is based on the empirical
power laws $J \propto 1/r^{\alpha^{(t)}}$ [10], emphasizing that $\alpha^{(t)}$
depends on the type ($t$) of publication, even if only slightly. A bonus (?) is
found to be the simple power law relationship between the core value and the
number of different coauthors, see the Appendix.

Note that there is at first no reason to predict that a simple relationship
will be found between the various quantities introduced above. In fact, an
examination of the distributions led to ambiguous results [30]. There is
apparently no other previous investigation of this matter. In fine, modeling
likely requires much more thinking. Sect. 4 serves as a conclusion on the
respective relevance of different types of publications in "evaluating" a LI
and his/her CAs, with some suggestions for future work. Nevertheless, the
findings could imply practical considerations on subsequent measurements of
publication activities during a career, as self-citations might do [31].

Note also that several other so-called laws have been predicted or discovered
about relations between number of authors, number of publications, number of
citations, funding, dissertation production, the number of journals or
scientific books, time intervals, etc. [32].

^1 The order of authors is at this level not discussed.

Figure 1: Empirical proof of the linear relationship between the total CA core
value and the corresponding ones, distinguishing between peer review
journals (j) and "proceedings" (p) papers, i.e. $m_a^{(j)} + 0.414\, m_a^{(p)}
\simeq m_a^{(jp)}$; $R^2 = 0.894$; the dashed line indicates the best possible
two-parameter linear fit; several data points overlap each other

Figure 2: Empirical proof of the linear relationship, $A_a^{(jp)} \simeq
A_a^{(j)} + 1.36\, A_a^{(p)}$, between $A_a$'s, i.e. the (Number of joint
publications-Coauthor rank) histogram $J$ vs. $r$ surface below $m_a$, and the
corresponding $A_a$ values for peer review journals (j) and "proceedings" (p);
$R^2 = 0.998$; the dashed line indicates the best possible two-parameter linear
fit; several data points overlap each other

Figure 3: Empirical linear relationship, $a_M^{(jp)} \simeq a_M^{(j)} + 0.225\,
a_M^{(p)}$, between the whole $a_M$'s and the corresponding ones for joint
papers with coauthors in peer review journals (j) and "proceedings" (p);
$R^2 \simeq 0.581$; if the (DG), (AP) and (MM) points are not considered in the
fit, as being possible "outliers", the numerical values of the relationship are
slightly modified. N.B. The best possible two-parameter linear fits are not
shown, though they have a higher $R^2$ (see text for values), because their
intercept can hardly be interpreted

Figure 4: Empirical proof of the linear relationship, $a_a^{(jp)} \simeq
a_a^{(j)} + 0.503\, a_a^{(p)}$, between $a_a$ values resulting from the (Number
of joint publications (jp)-Coauthor rank) histogram, $J$ vs. $r$, surface below
the core value $m_a$, and those for peer review journals (j) and "proceedings"
(p); $R^2 = 0.957$, when the (JM) and (HSSB) points are not considered, as being
outliers, see text; the fit including such points is shown by the blue dashed
line; the best two-parameter linear fits are not shown because their intercept
can hardly be interpreted

2 Data sample 

For the following study and discussion, the same LIs as those investigated in
previous publications [10, 28, 29, 30] are considered. The LIs span a large
range of scientific research topics, though mainly in statistical physics. They
are mentioned by their initials. All but two are male. Seven come from Poland,
i.e. RW, JMK, AP, DG, KSW, JM, and MM; 4 are from the "western world", i.e.
HES, DS, MA, and PC. About half are senior (JMK, AP, HES, DS, MA, PC) and half
rather junior scientists. In previous reports [10] [...], their publication
lists have been summarized and are thus not recalled here. Besides the LIs,
4 BSS cases [30], i.e. the so-called HSSH, HSSB, MARC and MANY, have been added
in order to complete the so far rather rare data. This leads to 15 cases being
examined. A priori, the data do not seem to be specifically biased.

The best power law fit exponent $\alpha$, through Eq. (2), the $m_a$ value, and
the main statistical characteristics of the distribution, i.e. the mean $\mu$
and $r_M$, are given in Table 1. This well illustrates the similarity in
behavior, but points to differences to be examined next in more detail. One may
expect, from a general point of view, that the subsets, i.e. joint publications
in peer review journals (j) and in proceedings (p), might have some influence
on the characteristics of the whole (jp) set, since they form its structure.
However, the problem is highly non-linear in essence, since the "rank" is not a
usual variable. Nevertheless, in line with modern statistical analysis, and in
order to detect some substructure, several fits can be attempted, i.e. power
law, exponential, logistic, ... and polynomial, the simplest being the linear
one.

The deduced values of $A_a$, $a_M$ and $a_a$ are given for the three sets, i.e.
(jp), (j) and (p), in Table 2. The empirically found laws are presented in
Figs. 1-4, for the $m_a$, $A_a$, $a_M$, $a_a$ quantities.

A simple relation is found between the core measures, i.e.

$$ m_a^{(j)} + 0.414\, m_a^{(p)} = m_a^{(jp)}, \tag{9} $$

with a high regression coefficient $R^2 \simeq 0.894$, and between the
histogram surfaces below the corresponding core measures,

$$ A_a^{(j)} + 1.36\, A_a^{(p)} = A_a^{(jp)}, \tag{10} $$

with a very high regression coefficient $R^2 \simeq 0.998$, as seen in Figs. 1
and 2, respectively. In both cases, a classical two-parameter linear fit has
also been attempted. It has been found that $m_a^{(jp)} = m_a^{(j)} + 0.452\,
m_a^{(p)} - 0.321$ and $A_a^{(jp)} = A_a^{(j)} + 1.354\, A_a^{(p)} + 1.917$,
respectively, with the corresponding $R^2$ values being equal to 0.904 and
0.998. Such fits are shown in Figs. 1 and 2.
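
A one-parameter proportionality fit of this kind can be sketched as follows, using the $m_a$ values listed in Table 1. The regression procedure behind Eq. (9) is not specified in the text, so the least-squares fit through the origin of $m_a^{(jp)} - m_a^{(j)}$ against $m_a^{(p)}$ used here is an assumption, and the function name is hypothetical. On these values it gives a coefficient of about 0.43, of the same order as the quoted 0.414.

```python
# (m_a^(jp), m_a^(j), m_a^(p)) for the 15 cases, as listed in Table 1.
CORES = [
    (26, 20, 15), (12, 12, 3), (20, 15, 10), (4, 3, 3), (6, 4, 4),
    (5, 4, 3), (6, 5, 2), (2, 2, 2), (3, 3, 1), (2, 2, 1), (3, 2, 2),
    (16, 11, 10), (15, 10, 10), (11, 9, 7), (5, 3, 3),
]


def proportional_coefficient(rows):
    """Least-squares c in (m_jp - m_j) = c * m_p, with no intercept
    (an assumed procedure, not necessarily the one used for Eq. (9))."""
    num = sum((m_jp - m_j) * m_p for m_jp, m_j, m_p in rows)
    den = sum(m_p * m_p for _, _, m_p in rows)
    return num / den


c = proportional_coefficient(CORES)
print(round(c, 3))
```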

Let it be observed that one necessarily has

$$ \Sigma^{(j)} + \Sigma^{(p)} = \Sigma^{(jp)}, \tag{11} $$

which is nothing other than a normalization condition on the NJP, as
exemplified in Table 2.
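
The normalization condition can be checked directly on the tabulated surfaces. The sketch below uses a few rows of Table 2; the helper name `normalization_gap` is hypothetical.

```python
# (Sigma^(jp), Sigma^(j), Sigma^(p)) for a few cases, as listed in Table 2.
SURFACES = {
    "HES": (3889, 2639, 1250),
    "DS": (763, 691, 72),
    "MA": (1557, 1055, 502),
    "DG": (118, 14, 104),
    "MARC": (560, 350, 210),
}


def normalization_gap(total, journals, proceedings):
    """Difference between the (jp) surface and the sum of its two parts."""
    return total - (journals + proceedings)


gaps = {name: normalization_gap(*values) for name, values in SURFACES.items()}
print(gaps)
```

A zero gap for every case means that the journal and proceedings surfaces add up exactly to the total one.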
Subsequently, $a_M$ has been examined. The result is reported in Fig. 3. One
finds

$$ a_M^{(j)} + 0.225\, a_M^{(p)} = a_M^{(jp)}, \tag{12} $$

but with a low $R^2 \simeq 0.581$. More discussion, stemming from the presence
of anomalous data, so-called "outliers", is found in Sect. 3.1.

Finally, a fine linear fit is found for $a_a$, as seen in Fig. 4:

$$ a_a^{(jp)} = a_a^{(j)} + 0.503\, a_a^{(p)}, \tag{13} $$

with $R^2 \simeq 0.957$, even with the presence of anomalous (outlier) data, as
discussed in Sect. 3.1.

3 Discussion 

3.1 Outliers 

First, let us discuss the cases of $a_M$ and $a_a$, before introducing an
estimate of the numerical coefficients. Visual inspection shows that there are
a few data points looking like outliers^2 in Fig. 3 and Fig. 4. They are called
(DG), (AP), and (MM) on one hand, and (JM) and (HSSB) on the other hand.

For the $a_M$ case, Fig. 3, it is remarked that (DG) has a specially high
number of CAs, i.e. 93, having only one ("proceedings") joint publication with
DG. This stems from some work by DG in high-energy physics before he turned
towards statistical mechanics research. If such a data point is included in a
proportionality fit between $a_M^{(jp)} - a_M^{(j)}$ and $a_M^{(p)}$, one
obtains a proportionality coefficient equal to 0.335, and an $R^2$ value equal
to 0.312. Concerning (AP) and (MM), it is observed that $a_M^{(jp)} -
a_M^{(j)} < 0$, which may at first appear awkward and unattractive. Note that
the respective values of $\mu$ and $m_a$ seem "reasonable". However, the origin
of the negative value is likely attributable to the fact that AP and MM have
very few "proceedings" papers, having mainly concentrated their research output
in peer review journals, with few papers resulting from scientific meetings. In
some sense, through these two authors, one points to the effect of
duplicate-like papers.

The comment on the "(DG) effect" implies that the proportionality coefficient
is likely to depend on the research field. On the other hand, the comment on
the "(AP)-(MM) effect" indicates that one should allow for a negative value of
$a_M^{(jp)} - a_M^{(j)}$, and propose that the coefficient to be accepted is
0.228 rather than 0.225.

For completeness, a classical two-parameter linear fit has been made, either
taking into account all data points, or removing outliers. The following
relations have been obtained for the $a_M$:

• taking into account all data points, $y = 0.255 + 0.33x$, with $R^2 = 0.349$,

• without the (DG), (AP) and (MM) points, $y = 2.964 + 0.164x$, with
$R^2 = 0.769$,

• without the (DG) point, $y = 1.232 + 0.197x$, with $R^2 = 0.718$.

^2 Interesting comments on outliers can be found in [33, 34].

For the $a_a$ cases, see Fig. 4, HSSB is seen to have a very high positive
$a_a^{(jp)} - a_a^{(j)}$ value, while for JM one obtains $a_a^{(jp)} -
a_a^{(j)} < 0$. It is fair to emphasize that there is a similarity between the
$a_a$ and $a_M$ cases. However, the deductions originate from different
surfaces, i.e. below the cores in the $a_a$ cases, to be contrasted with the
whole surfaces for the $a_M$ cases.

Since HSSB have almost equal NJP with their main CAs in either the (p) or the
(j) set, one might wonder if these are duplicate results, since they imply the
same main CAs. The case of JM shows that this is not a duplicate finding:
indeed, on the contrary, JM has very few p-type publications with his/her main
CAs. Moreover, the different "behavior" of HSSB and JM highlights the fact that
JM rather has a concentration of scientific output in peer review journals with
his/her main CAs (as for AP and MM, in fact).

For completeness, a classical two-parameter linear fit has been made, taking
into account all data points, or removing outliers. The following relations
have been obtained for the $a_a$:

• taking into account all data points, $y = -0.671 + 0.608x$, with
$R^2 = 0.775$,

• without the (HSSB) and (JM) points, $y = -0.07 + 0.507x$, with
$R^2 = 0.957$.

Note that one should not be impressed by the respective $R^2$ values, which
obviously depend on the number of data points and on the fact that these are
two-parameter fits, in contrast to the values given above in either Eq. (12)
or Eq. (13).

The positive or negative value of the intercept can be attributed to the fact
that the resulting combination (jp) of the (j) and (p) sets is a priori highly
non-linear. Indeed, the various ranks do not sum up, since a CA in one set may
appear at two rank values totally unrelated to the resulting rank for the (jp)
set.

Finally, in all cases, one may deduce that these (outlier-like) results arise 
from different scientific (or other) behavior of the respective scientists, but this 
discussion is outside the realm of the present paper. 


3.2 Theoretical estimates 

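The numerical proportionality coefficients, as well as those resulting from a
two-parameter linear fit, can be discussed by going to the continuum limit for
the respective histograms. Indeed, within a continuum approximation, one has,
for an upper rank limit $R$,

$$ \int_1^{R} J(r)\, dr = \int_1^{R} \frac{J_0}{r^{\alpha}}\, dr = \frac{J_0}{1-\alpha} \left[ R^{(1-\alpha)} - 1 \right]. \tag{14} $$

Consequently,

$$ \Sigma = \sum_{r=1}^{r_M} J_r \simeq \frac{J_0}{1-\alpha} \left[ r_M^{(1-\alpha)} - 1 \right], \tag{15} $$

$$ A_a = \sum_{r=1}^{m_a} J_r \simeq \frac{J_0}{1-\alpha} \left[ m_a^{(1-\alpha)} - 1 \right], \tag{16} $$

$$ a_M \equiv \frac{\Sigma}{m_a} \simeq \frac{J_0}{(1-\alpha)\, m_a} \left[ r_M^{(1-\alpha)} - 1 \right], \tag{17} $$

$$ a_a \equiv \frac{A_a}{m_a} \simeq \frac{J_0}{(1-\alpha)\, m_a} \left[ m_a^{(1-\alpha)} - 1 \right]. \tag{18} $$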
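
To get a feeling for the quality of the continuum approximation, the discrete histogram surface can be compared with the closed-form integral of $J_0/r^{\alpha}$ between 1 and the upper rank. This is only a sketch: the parameter values are hypothetical, $\alpha \neq 1$ is assumed, and the function names are not taken from the paper.

```python
def discrete_surface(j0, alpha, upper_rank):
    """Sum of J_r = J_0 / r**alpha for r = 1..upper_rank (histogram surface)."""
    return sum(j0 / r**alpha for r in range(1, upper_rank + 1))


def continuum_surface(j0, alpha, upper_rank):
    """Integral of J_0 / r**alpha from 1 to upper_rank (requires alpha != 1)."""
    return j0 / (1.0 - alpha) * (upper_rank ** (1.0 - alpha) - 1.0)


# Hypothetical parameters, not fitted to any LI of this study.
j0, alpha, r_max = 50.0, 0.8, 300
exact = discrete_surface(j0, alpha, r_max)
approx = continuum_surface(j0, alpha, r_max)
print(exact, approx, exact - approx)
```

Since $J_0/r^{\alpha}$ decreases with $r$, the integral from 1 underestimates the discrete sum, by at most $J_0$; the relative error therefore shrinks as the surface grows.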


Recall that these quantities depend both on the type of publication set and
on the LI. For example, for one specific LI, and some publication set, one has
from Eq. (16),

$$ m_a^{(1-\alpha)} = 1 + (1-\alpha)\, (A_a/J_0), \tag{19} $$

while from Eq. (15), one finds

$$ r_M^{(1-\alpha)} = 1 + (1-\alpha)\, (\Sigma/J_0). \tag{20} $$

Observe that

$$ \frac{\Sigma + J_0/(1-\alpha)}{A_a + J_0/(1-\alpha)} = \left( \frac{r_M}{m_a} \right)^{(1-\alpha)} \tag{21} $$

for each LI and each type of scientific publication. Some elementary, but very
tedious, algebra can follow. From Eq. (21), one can extract $\Sigma$ and rewrite
explicitly Eq. (11), in order to obtain a linear relationship between the three
$A_a$ quantities, such that one can write

$$ A_a^{(jp)} = \Lambda_j\, A_a^{(j)} + \Lambda_p\, A_a^{(p)} + [\ldots], \tag{22} $$

where each $\Lambda$ and $[\ldots]$ are functions of $r_M$ and $m_a$. Moreover,
as shown in the Appendix, one has a (surprisingly simple) power law
relationship between $m_a$ and $r_M$, i.e. $m_a = v\, r_M^{\gamma}$ (or $r_M =
\nu\, m_a^{\beta}$), see Figs. 5-6. Therefore, one can evaluate the ratio
$\Lambda_p/\Lambda_j$ for the various cases; it is about $(\nu_p/\nu_j)^{-2/3}
\simeq (2.9/5.0)^{-2/3} \simeq 1.44$, not too far from 1.36. The complicated
$[\ldots]$ term can be roughly estimated, according to the various numerical
values. It is found to be rather small with respect to the other terms, thereby
corroborating the finding in Eq. (10).
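
The arithmetic behind the quoted ratio can be checked in a couple of lines. The exponent $-2/3$ is an assumption made here because it reproduces the quoted value 1.44 from the rounded prefactors 2.9 and 5.0; the variable names are hypothetical.

```python
nu_p, nu_j = 2.9, 5.0      # rounded prefactors of r_M = nu * m_a**beta (Appendix)
exponent = -2.0 / 3.0      # assumed exponent, chosen to match the quoted 1.44
ratio = (nu_p / nu_j) ** exponent
print(round(ratio, 2))     # prints 1.44
```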

To "verify" Eq. (9) is more subtle and complicated. Indeed, one has to start
from Eq. (11), in which one substitutes each Eq. (15). This leads to a highly
non-linear relationship of the type

[...]    (23)

where

$$ M_a = m_a^{\gamma (1-\alpha)}. \tag{24} $$

To extract a linear relationship analytically, like $m_a^{(jp)} = \omega_j\,
m_a^{(j)} + \omega_p\, m_a^{(p)} + [\ldots]$, similar to Eq. (9), seems feasible
only if $\gamma (1 - \alpha) = (1/k)$, where $k$ is an integer. For the sake of
argument, taking $\gamma = 3/2$ and $\alpha = 1/3$, thus $k = 1$, one can
estimate, according to the values in Tables 1-2 and those in the Appendix, that
the ratio $\omega_p/\omega_j \simeq 0.5$, not too far from 0.4 in Eq. (9).

4 Conclusions 

In summary, recall that the core $m_a$ of coauthors (CAs) of a leading
investigator (LI) is well defined through a criterion similar to the $h$-index.
Through a $J$ vs. $r$ histogram, one describes a CA "impact" according to the
number of his/her joint publications $J$ with a LI. It is usually considered
that scientific publications in proceedings differ, in various ways and
"values", from those in peer review journals. This belief was questioned above.
In fact, one can distinguish the core of coauthors of a LI according to the
type of joint publications. Next, it was asked whether the relative hierarchy
in estimating the value of publications in journals or proceedings can be
carried over to the core of co-authors. Finally, the question was raised: "is
there any numerical proof of the usual belief in a qualitative difference?"

Visually, it appears that the hierarchy exists, i.e. $m_a^{(j)} > m_a^{(p)}$,
but, somewhat surprisingly, the relationship turns out to have a simple
analytic form: $m_a^{(j)} + 0.4\, m_a^{(p)} = m_a^{(jp)}$. Moreover, another
relationship is found concerning the $A_a$ index, i.e. the surface of the $J$
vs. $r$ histogram up to $m_a$, i.e., $A_a^{(j)} + 1.36\, A_a^{(p)} =
A_a^{(jp)}$. These linear relationships also hold for subsequently derived
measures.

The findings have been illustrated by gathering data for a dozen or so LIs, and
for 4 couples, i.e. publications in which a LI is systematically with a
specific CA. Even though the data, about scientists working in the research
field of statistical physics, are of finite size, the result does not seem to
be biased. A test indicates the reliability of the found features. Yet, in one
case, that of a LI having previously worked in high energy physics and having
many (93) CAs for one publication of the (p)-type, some anomalous behavior
occurs. This outlying feature should not distract from the main findings.

The main text analysis points to the interest of measuring CA cores according
to types of publications. The results of course suggest investigating other
scientific domains. In fact, this can be done through a bonus, the discovery
that $m_a$ is tied to the maximum number of CAs of a LI, i.e. $r_M$, through a
simple power law. In so doing, one may observe that outliers are easily found
[33, 34]; removing them leads to a large increase in the $R^2$ coefficient, in
fine giving much weight to the interest of such numerical findings. It is also
obvious that the analysis is rather simple for policy makers, when the list of
publications is available.

Theoretical work, whence explanations, has been shown not to be trivial,
because the systems are highly non-linear. A CA might appear in one type (j or
p) of publications, but not in the other type. More frequently, however, a CA
appears in both types of publication, though possibly at a quite different
ranking. This is the case when a CA does not have "many" publications with a
LI. In fact, the ranking of some CA in some type of publication does not simply
add up with the ranking in another type of publication. A CA ranking is usually
not conserved, but depends on the publication type. Hence, there is a priori no
reason why a linear relationship should be found between such measures.

Figure 5: Empirical proof of the power law relationship, $m_a \simeq
r_M^{\gamma}$, resulting from the (Number of joint publications (jp)-Coauthor
rank) histogram, $J$ vs. $r$, surface and that for peer review journals (j) and
"proceedings" (p), when the (DG) point is not considered, as being an outlier

Figure 6: Empirical proof of the power law relationship, $r_M \simeq
m_a^{\beta}$, when the outlier (DG) point is not considered, for the cases of
peer review journals (j) and "proceedings" (p), and the total (jp) number of
joint publications

However, there is one "normalizing condition", in general not appreciated,
which imposes further consideration of such co-authorship measures: the surface
of the histogram $J$ vs. $r$ for the whole set of publications is necessarily
the sum of the surfaces for the two types of publications. Thus, since in the
continuum limit $J \propto 1/r^{\alpha}$, one can measure such surfaces as a
function of $\alpha$ and $r_M$, and later derive an estimate of the numerical
coefficients obtained through empirical fits.

Finally, one should not simply consider that these are numerical games.
They lead to remarkable proportionality measures of CA roles. It appears that
the number of CAs "of interest" for measuring the core of CAs of a LI mainly
arises from the joint publications in peer review journals. Indeed, only about
($0.4/1.4 \simeq$) 30% stem from "proceedings". Similarly, it appears that the
"contribution" to the number of joint publications by the main CAs is about 50%
of the whole.
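
Both percentages can be illustrated with a short computation. The proceedings share follows the arithmetic quoted in the text; for the core contribution, the unweighted mean of $A_a/\Sigma$ over the (jp) values listed in Table 2 is used, which is an assumption about how such an overall figure could be estimated and gives a value somewhat above 40%. The variable names are hypothetical.

```python
# (Sigma^(jp), A_a^(jp)) for the 15 cases, as listed in Table 2.
TOTAL_AND_CORE = [
    (3889, 1625), (763, 259), (1557, 810), (99, 46), (129, 64),
    (111, 39), (135, 64), (118, 14), (49, 18), (27, 12), (50, 16),
    (1098, 524), (1088, 469), (560, 280), (104, 52),
]

# Arithmetic quoted in the text: a weight 0.4 for proceedings against
# a weight 1 for journals in the core relation.
proceedings_share = 0.4 / (1.0 + 0.4)

# Share of the cumulated NJP carried by the core coauthors, A_a / Sigma,
# averaged over the cases without weighting (an assumed choice).
core_shares = [a_core / sigma for sigma, a_core in TOTAL_AND_CORE]
mean_core_share = sum(core_shares) / len(core_shares)

print(f"proceedings share: {proceedings_share:.1%}")
print(f"mean core share:   {mean_core_share:.1%}")
```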

This calls for elaborating practical considerations on subsequent measurements
of publication activities during a career, and on the role and effects of
coauthors [...] [37]. The case of outliers is also of interest, since it
carries some weight in estimating $\alpha$ and subsequent numerical
coefficients.

These simple relations imply that coauthors are a "more positive measure"
of a principal investigator role, in both types of scientific outputs, than the
Hirsch index, which merely counts the number of citations independently of the
co-authors (their number, and a fortiori their rank). Therefore, to scorn
co-authors in publications, in particular in proceedings, is highly incorrect.
On the contrary, the findings suggest an immediate test of coherence of
scientific authorship in scientific policy processes. This could imply many
practical considerations on the role of CAs with respect to a LI and on the
respective roles of different types of publications in "measuring" a LI team
work. A discussion of criteria based on the above for estimating, e.g., the
financing of a LI or a team, is outside the realm of the present paper.
Nevertheless, with software currently available to policy makers, the
development and application of such findings should be easily possible.

Note three final points: (i) each rank-frequency form, like Eq. (1) or Eq. (2),
has an equivalent size-frequency one [38]. One should become curious about
whether similar equalities hold for the size-frequency cases; (ii) the above
considerations suggest investigating whether similar relationships exist for
the $h$-index, distinguishing between $h^{(j)}$ and $h^{(p)}$, and drawing ad
hoc conclusions; (iii) complex systems do not necessarily lead to non-linear
laws.
Appendix

Other empirical laws: $m_a$ vs. $r_M$ and $r_M$ vs. $m_a$, with or without
outliers

For the theoretical estimation of the numerical values of the coefficients
discussed in Sect. 3.2, in particular in order to proceed beyond Eq. (21), it
appears that it is useful to find whether a relationship exists between $r_M$
and $m_a$ on average.

The simplest power law fits for $m_a$ vs. $r_M$, i.e. $m_a \simeq
r_M^{\gamma}$, give, taking into account all data points,

$$ m_a^{(jp)} = 0.392\, [r_M^{(jp)}]^{(0.645)}, \quad \text{with } R^2 = 0.893 $$
$$ m_a^{(j)} = 0.405\, [r_M^{(j)}]^{(0.629)}, \quad \text{with } R^2 = 0.978 $$
$$ m_a^{(p)} = 0.904\, [r_M^{(p)}]^{(0.415)}, \quad \text{with } R^2 = 0.760 $$

but lead to

$$ m_a^{(jp)} = 0.393\, [r_M^{(jp)}]^{(0.667)}, \quad \text{with } R^2 = 0.946 $$
$$ m_a^{(j)} = 0.405\, [r_M^{(j)}]^{(0.629)}, \quad \text{with } R^2 = 0.978 $$
$$ m_a^{(p)} = 0.842\, [r_M^{(p)}]^{(0.460)}, \quad \text{with } R^2 = 0.917 $$

when the outlier (DG) data point is not taken into account, i.e. roughly
speaking $m_a \simeq 0.5\, r_M^{\gamma}$, with $\gamma \simeq 1/2$ or 2/3.
Similarly, the best power law fits for $r_M$ vs. $m_a$ give, when taking into
account all data points,

$$ r_M^{(jp)} = 8.691\, [m_a^{(jp)}]^{([\ldots])}, \quad \text{with } R^2 = 0.893 $$
$$ r_M^{(j)} = 5.013\, [m_a^{(j)}]^{(1.483)}, \quad \text{with } R^2 = 0.969 $$
$$ r_M^{(p)} = 4.261\, [m_a^{(p)}]^{([\ldots])}, \quad \text{with } R^2 = 0.867 $$

but become

$$ r_M^{(jp)} = 4.589\, [m_a^{(jp)}]^{(1.438)}, \quad \text{with } R^2 = 0.936 $$
$$ r_M^{(j)} = 5.013\, [m_a^{(j)}]^{(1.466)}, \quad \text{with } R^2 = 0.969 $$
$$ r_M^{(p)} = 2.871\, [m_a^{(p)}]^{(1.683)}, \quad \text{with } R^2 = 0.960 $$

when the outlier (DG) is not taken into account, i.e. roughly speaking
$r_M \simeq 4.5\, m_a^{\beta}$, with $\beta \simeq 3/2$ or 5/3. The "no (DG)"
cases are shown in Figs. 5-6 for illustration of the findings. Note the large
$R^2$ values.
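
A power law fit of this kind can be sketched with an ordinary least-squares regression on logarithms. The fitting procedure used for the values above is not specified, so this choice is an assumption, as is the function name; the points are a subset of the (j) values listed in Table 1, with (DG) excluded. On this subset the exponent comes out near 0.6, in line with $\gamma \simeq 1/2$ to 2/3.

```python
import math

# (r_M^(j), m_a^(j)) for a subset of cases, as listed in Table 1
# (HES, DS, MA, JMK, AP, MM, HSSH, MARC, MANY; DG excluded).
POINTS = [(568, 20), (268, 12), (273, 15), (35, 4), (45, 5),
          (16, 2), (169, 11), (114, 9), (28, 3)]


def power_law_fit(points):
    """Least squares on log-log data for y = prefactor * x**exponent.
    Returns (prefactor, exponent)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(y_mean - exponent * x_mean)
    return prefactor, exponent


prefactor, gamma = power_law_fit(POINTS)
print(f"m_a ~ {prefactor:.2f} * r_M^{gamma:.2f}")
```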

             m_a                  α                       μ                     r_M
       (jp)  (j)  (p)   (jp)    (j)    (p)    (jp)     (j)     (p)   (jp)   (j)   (p)
HES      26   20   15  1.135  0.999  1.045   6.569    4.67   5.136   592   568   242
DS       12   12    3  0.796  0.535  0.688   2.725   2.578   1.565   280   268    46
MA       20   15   10  1.102  1.029   0.86   4.872   3.865   3.041   319   273   168
PC        4    3    3   0.87   0.94   0.67   3.097   2.684     2.0   302   129    24
RW        6    4    4  0.743  0.767  0.561    2.75    1.94    2.71    46    34    23
JMK       5    4    3  0.787  0.702  0.618   2.707   1.714    2.04    41    35    25
AP        6    5    2   0.94   0.64   0.89   2.872   2.622  1.5455    47    45    11
DG        2    2    2  0.547  0.755  0.239    1.13    1.75    1.05   104     7    99
KSW       3    3    1  0.715  1.255  0.594    2.13     2.0    1.67    21    21     3
JM        2    2    1   0.63   0.67   0.67    1.75    1.71    1.33    14    12     1
MM        3    2    2  0.536  0.428  0.521   1.515  1.3125    1.45    33    16    20
HSSH     16   11   10  1.074  0.934  0.974   5.602    3.87   3.895   196   169   114
HSSB     15   10   10  1.064  0.922  0.969   5.104   3.549   3.766   214   176   114
MARC     11    9    7  0.985  0.893  0.887    3.81    3.07   2.958   147   114    71
MANY      5    3    3  0.835  0.755   0.67    2.60   2.143   1.833    40    28    24

Table 1: Summary of direct data values for 11 LIs and 4 BSSs: $m_a$ is the
core measure; $\alpha$ is the exponent of the empirical power law, Eq. (2);
$\mu$ is the mean of the distribution ($J$ vs. $r$); $r_M$ is the total number
of different CAs (NDCA); always distinguishing, among joint publications, the
total (jp) sum, the journals (j) and the "proceedings" (p)

                 a_M                     Σ                 A_a                 a_a
         (jp)     (j)     (p)  (jp)   (j)   (p)  (jp)   (j)   (p)    (jp)     (j)     (p)
HES     149.6  131.95    83.3  3889  2639  1250  1625   895   549    62.5   44.75    36.6
DS      63.58   57.58      24   763   691    72   259   229    16   21.58   19.08    5.33
MA      77.85   70.33    50.2  1557  1055   502   810   482   221    40.5   32.13    22.1
PC      24.75      17      16    99    51    48    46    29    16    11.5    9.67    5.33
RW       21.5      16   16.25   129    64    65    64    21    33   10.67    10.2    8.25
JMK      22.2      15      17   111    60    51    39    21    16     7.8    5.25    5.33
AP       22.5    23.6     8.5   135   118    17    64    51     7   10.67    10.2     3.5
DG         59       7      52   118    14   104    14     7     7       7     3.5     3.5
KSW     16.33   14.67       5    49    44     5    18    15     3       6       5       3
JM       13.5    11.5       2    27    23     2    12    10     3       6       5       3
MM      16.67    10.5    14.5    50    21    29    16     7     8    5.33     3.5       4
HSSH   68.625   59.45    44.4  1098   654   444   524   236   204   32.75   21.45    20.4
HSSB    72.53    62.1    46.7  1088   621   467   469   202   191   31.27    20.2    19.1
MARC    50.91   38.89      30   560   350   210   280   151    97   25.45  16.778   13.86
MANY     20.8      20  14.667   104    60    44    52    26    19    10.4   8.667    6.33

Table 2: Summary of indirect data values for 11 LIs and 4 BSSs: $a_M =
\Sigma/m_a$, where $m_a$ is the CA core measure; $\Sigma$ is the surface below
the histogram ($J$ vs. $r$), i.e. TNCA; $A_a$ is the surface below the $J$ vs.
$r$ histogram, limited to the core value $m_a$; $a_a = A_a/m_a$; each value is
given for the total (jp) number of joint publications, journals (j) or
"proceedings" (p)